For the Automated Mark-Up of Italian Legislative Texts in XML
Abstract
In this paper we present a method for mining information from legal texts, in particular from corpora of statutes. Text mining, or more generally information extraction, can be of valuable help to people researching the linguistic structure of statutes and, as a side effect, can seed a new generation of applications for validation and conversion in the legislative domain.
Similar resources
Formal Models for a Legislative Grammar. Explicit Text Amendment
In this paper we present a method for mining information from legal texts, in particular from corpora of statutes. Text mining, or more generally information extraction, can be of valuable help to people researching the linguistic structure of statutes and, as a side effect, can seed a new generation of applications for validation and conversio...
Semantic Mark-up of Italian Legal Texts Through NLP-based Techniques
In this paper we illustrate an approach to information extraction from legal texts using SALEM. SALEM is an NLP architecture for semantic annotation and indexing of Italian legislative texts, developed by ILC in close collaboration with ITTIG-CNR, Florence. Results of SALEM performance on a test sample of about 500 Italian law paragraphs are provided.
Constructing and exploiting an automatically annotated resource of legislative texts
In this paper, we report on the construction of a resource of Swiss legislative texts that is automatically annotated with structural, morphosyntactic and content-related information, and we discuss the exploitation of this resource for the purposes of legislative drafting, legal linguistics and translation and for the evaluation of legislation. Our resource is based on the classified compilati...
D2d - A Robust Front-end for Prototyping, Authoring and Maintaining XML Encoded Documents by Domain Experts
In many cases, domain experts are used to writing down their knowledge in contiguous texts. A standard way to facilitate the automated processing of such texts is to add mark-up, for which the family of XML-based standards is current best practice. But the default textual appearance of XML mark-up is not suited to being typed, read and edited by humans. The authors’ d2d notation provides an alternat...
Capturing Coercions in Texts: a First Annotation Exercise
In this paper we report the first results of an annotation exercise on argument coercion phenomena performed on Italian texts. Our corpus consists of ca. 4000 sentences from the PAROLE sottoinsieme corpus (Bindi et al. 2000) annotated with Selection and Coercion relations among verb-noun pairs formatted in XML according to the Generative Lexicon Mark-up Language (GLML) format (Pustejovsky et al....